The process of 'Evolutionary Diffusion', i.e. reproduction with local mutation but without selection in a biological population, resembles standard Diffusion in many ways. However, Evolutionary Diffusion allows the formation of localized peaks that undergo drift, even in the infinite population limit. We relate a microscopic evolution model to a stochastic model which we solve fully. This allows us to understand the large population limit, relates evolution to diffusion, and shows that independent local mutations act as a diffusion of interacting particles taking larger steps.

PACS numbers: 87.23.-n, 02.50.-r, 02.50.Ey, 05.40.-a



Reproduction involving random mutations seems at first to lead to a diffusion of the population in type space; however, the diffusion involved is anomalous in several ways. A localized configuration that we call a 'peak' forms in type space [1], [?], and diffuses as a single entity. The variations in the peak width grow with population size at the same rate as the peak width itself, rendering the infinite population limit meaningless. In contrast, the distribution of a large number of non-interacting particles undergoing local diffusion forms a Normal Distribution with width increasing in time. We will argue that a completely solvable stochastic differential equation model captures the same dynamics as the microscopic evolution process, and provides a meaningful description for the large population limit. We show that although mutations are independent, the effective diffusion is not.

Much previous work on the clustering of individuals in type space focuses on the genealogical lineage. Ref. [?] provides a comprehensive discussion and a complete solution from this viewpoint. We imagine a population of fixed size $N$; in each generation, some individuals can expect to have many offspring and others will have none. After some time the whole population will share a common ancestor, by the process of Gambler's ruin [?], and hence all individuals must have similar type.

Lineage analysis is a good tool to study high dimensional genotype spaces. The theory of Critical Branching Processes [?] finds that in high dimensions ($d > d_c$, where the critical dimension $d_c = 2$ [?]) describing genotype space, birth/death dynamics are described fully by the lineages. A lineage remains distinct until all individuals in it die. However, in low dimensions ($d < d_c$) describing phenotypes, additional clustering within a distribution occurs. Although sometimes distinct, the clusters in phenotype space can merge, and hence clusters are poorly defined entities. Instead, a careful average over the distribution that we call a 'peak' provides a more useful description. Low dimensional clustering due to birth-death processes was previously only understood in real space [?], with neutral phenotype clustering addressed indirectly [?].

The clustering described above is fluctuation driven. Fluctuations must be considered in evolution unless the number of individuals per type is high [11], or there is strong selection [?]. Otherwise, there is always a region in type space in which the population is small, and therefore a part of the equilibrium distribution that is affected by noise. It is only in the fluctuations that Evolutionary Diffusion differs from normal Diffusion.

Understanding neutral evolution (i.e. reproduction with mutation but without selection) is of great importance due to its wide usage in numerous contexts, from Genealogical Trees [?], [14, 15], to models of mutations in RNA [?]. Neutral models provide good matches with observed Species-Area Relations and Species-Abundance Distributions [?].

Microscopic model: We are interested in the distribution of types in a population of individuals as they evolve. For comparison to Diffusion, we assume that the total population $N(t) = N$ is constant, a constraint that can easily be relaxed. In addition, we use the simplest type space, namely the 1-dimensional set of integers. However, the qualitative behavior discussed will remain the same in all large connected type spaces. The timesteps of the microscopic processes we consider are:

The Diffusion Process: 

1. Select an individual $i$ (at position $x$), each with probability $1/N$.

2. Move to $y = x \pm 1$, each with probability $p_m/2$, or leave at $y = x$ with probability $1 - p_m$.

The Evolutionary Diffusion process (which is the 
Moran process [19] for a type distribution): 

1. Select an individual $i$ (at position $x$), each with probability $1/N$, and mark for killing.

2. Select an individual $j$ (at position $x_j$) for reproduction, each with probability $1/N$.

3. Remove individual $i$, and create an offspring of individual $j$ at $y = x_j$ with probability $(1 - p_m)$, or mutate to $y = x_j \pm 1$, each with probability $p_m/2$. Hence the effective diffusive step is $y - x$.
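
The two update rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the figures: the function names `diffusion_step`, `evolution_step` and `run` are hypothetical, selection is assumed uniform (probability $1/N$ each), and the parent $j$ is assumed to be drawn independently of the killed individual $i$, so that $j = i$ is allowed.

```python
import random

def diffusion_step(positions, p_m, rng):
    """One timestep of the Diffusion process: move one randomly chosen individual."""
    i = rng.randrange(len(positions))            # select individual i uniformly
    u = rng.random()
    if u < p_m:                                  # a move happens with probability p_m
        positions[i] += 1 if u < p_m / 2 else -1 # x+1 or x-1, each with probability p_m/2

def evolution_step(positions, p_m, rng):
    """One timestep of the Evolutionary Diffusion (Moran-type) process."""
    n = len(positions)
    i = rng.randrange(n)                         # individual marked for killing
    j = rng.randrange(n)                         # parent, drawn independently (assumption)
    y = positions[j]                             # offspring starts at the parent's type
    u = rng.random()
    if u < p_m:                                  # mutate with probability p_m
        y += 1 if u < p_m / 2 else -1
    positions[i] = y                             # offspring replaces the killed individual

def run(step, n, p_m, generations, seed=0):
    """Start with all n individuals at x = 0 and apply n timesteps per generation."""
    rng = random.Random(seed)
    positions = [0] * n
    for _ in range(generations * n):
        step(positions, p_m, rng)
    return positions
```

For example, `run(evolution_step, 1000, 0.5, 100)` returns the list of positions after 100 generations, which can then be histogrammed and compared with `run(diffusion_step, 1000, 0.5, 100)`.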

We will refer to properties of the Diffusion process with the subscript D, and the Evolutionary Diffusion process with the subscript E, e.g. $\mu_E(t)$ for the mean position of the individuals in the evolution process after $t$ timesteps. Time is best measured in generational time $T = t/N$. Care is needed when averaging: we will use the ensemble average (over many realizations) of a quantity $V$: $\overline{V}(t)$, the population average $\langle V \rangle(t) = \sum_i V_i(t)/N$, and the time average up to time $\tau$: $\langle V \rangle = \sum_{t=t_0}^{\tau} V(t)/(\tau - t_0)$.

Quantities calculated from probabilities are by definition ensemble averages, and so the notation refers to which average is taken first. See [?] for further details.

FIG. 1: A snapshot of the distribution after 80000 generations, using $N = 10000$ and $p_m = 0.5$, comparing Evolutionary Diffusion (grey line) and Diffusion (black line). Diffusion follows a (noisy) Normal distribution whereas Evolutionary Diffusion is localized as 2 clusters, which we combine as a 'peak' of width $w$ and position $\mu$ undergoing drift.

The number of individuals on site $x$ is $n(x,t)$, and the initial conditions are $n(x = 0, t = 0) = N$, $n(x, t = 0) = 0$ for $x \neq 0$. The ensemble average of the population distribution $\overline{n}(x,t)$ is obtained directly from the Master equation, and is identical for both Diffusion and Evolutionary Diffusion:

$$\frac{\partial \overline{n}(x,t)}{\partial t} = \frac{p_m}{2N}\,\nabla^2 \overline{n}(x,t). \tag{1}$$

Hence the (one-point) ensemble average of the two processes is the same, but numerical simulations (Fig. 1) reveal very different behavior. From the figure, we see that Diffusion has followed the ensemble average: a Normal distribution centered on 0, increasing in width with time [20]. Although we shall see that the Evolutionary Diffusion process self-averages over time, the thermodynamic limit is subtle. In order to understand why, we now split the peak up into its mean position and standard deviation to create a "Theory of evolutionary peaks".

Theory of evolutionary peaks: We define here conceptually simple and solvable processes of Evolutionary Diffusion and Diffusion which we argue capture the essential features of the microscopic models. The distribution is described as a 'peak': a Normal distribution with mean $\mu(t)$ and standard deviation (i.e. width) $w(t)$, which vary as a product of the dynamics. The probability distribution is continuous, but a discrete 'individual' of size $1/N$ is moved per timestep. Although a given realization of a peak never resembles a Normal distribution, this is a good model of the evolutionary process because a Normal distribution is a good approximation for the time average of the peaks in the variable $x' = x - \mu(t)$ (we now drop the dash notation); see Fig. 2. We hence 'integrate out' the inessential degrees of freedom: the particular distribution of individuals within the peak.

FIG. 2: Time-averaged Evolutionary Diffusion distribution (solid line). Normal distribution (dashed line) with standard deviation calculated from theory in Eq. (17). The two agree up to the second moment. Also shown is a snapshot of the distribution (thin line). ($N = 1000$, $p_m = 0.5$)

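In the evolutionary process, in each timestep a death will occur at any point $x$ in the distribution $p(x)$:

$$p_E(x; \mu = 0, w) = \frac{1}{\sqrt{2\pi}\,w}\, e^{-x^2/2w^2}. \tag{2}$$

The parent position $x_j$ will be drawn independently from the same distribution, and the offspring will be mutated with probability $p_m$ to $x_j \pm 1$. Hence the distribution for births $p(y)$ is:

$$p_E(y; \mu = 0, w) = (1 - p_m)\,\frac{e^{-y^2/2w^2}}{\sqrt{2\pi}\,w} + \frac{p_m}{2}\,\frac{e^{-(y+1)^2/2w^2} + e^{-(y-1)^2/2w^2}}{\sqrt{2\pi}\,w}. \tag{3}$$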

The probability distribution for the Diffusion process, moving a particle at $x$ to $x \pm 1$ with probability $p_m$, is written as:

$$p_D(x; \mu = 0, w) = \frac{1}{\sqrt{2\pi}\,w}\, e^{-x^2/2w^2}, \tag{4}$$

$$p_D(y; \mu = 0, w) = (1 - p_m)\,\delta(y - x) + \frac{p_m}{2}\left[\delta(y - x + 1) + \delta(y - x - 1)\right]. \tag{5}$$

The expectation value of a variable $V(x,y)$ is simply the integral of $V$ over the probability distribution:

$$\langle V(x,y) \rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} V(x,y)\,p(x)\,p(y)\,dy\,dx. \tag{6}$$

Eq. (6) is simple to calculate because all of our probabilities are independently Normal distributed, or interact trivially via delta functions.

We now perform calculations for the expectation values of $w(t+1)$ given $w(t)$ (working with the variance $w^2$ for simplicity). We consider the death of individual $q$ at $x_q$, which is replaced by a birth occurring at $y_q$.



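$$w^2(t) = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)^2, \tag{7}$$

$$w^2(t+1) = w^2(t) + \frac{y_q^2 - x_q^2}{N} - \frac{(y_q - x_q)^2}{N^2}, \tag{8}$$

$$F = \frac{y_q^2 - x_q^2}{N} - \frac{y_q^2 + x_q^2 - 2x_q y_q}{N^2}. \tag{9}$$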


Here we have defined $F = \Delta w^2(t)$ for later use, and used $\mu(t) = 0$. These quantities are population averages; we now ensemble average over the possible births and deaths by simple integration over Eq. (6). We find that for the Diffusion process, the expected change in the variance is always positive and independent of $w$:

$$\frac{E(\Delta w_D^2)}{\Delta t} = \frac{p_m}{N}. \tag{10}$$



For the evolution process, the expected change in the variance is:

$$\frac{E(\Delta w_E^2)}{\Delta t} = \frac{p^*}{N} - \frac{2w^2}{N^2}, \tag{11}$$

where for brevity we have defined $p^* = p_m - 1/N$ (assumed positive). This time, the rate of change of the variance depends on itself, and there is an equilibrium for which $E(\Delta w^2(x,y)) = 0$, at $w_E^{\rm equil} = \sqrt{Np^*/2}$. The product $Np^*$ is the average number of mutants per generation, minus one. By taking the limit $\Delta t \to 0$ in Eq. (11), and solving by separation of variables, we obtain the variance $\langle w_E^2 \rangle(T) = \frac{Np^*}{2}\left(1 - e^{-2T/N}\right)$.
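
As a quick numerical reading of this relaxation law, the sketch below evaluates the mean variance $\frac{Np^*}{2}(1 - e^{-2T/N})$ and its equilibrium value. It assumes $p^* = p_m - 1/N$, as implied by the statement that $Np^*$ is the number of mutants per generation minus one, and an initial condition of zero width; the function names are illustrative only.

```python
import math

def p_star(n, p_m):
    """Effective mutation probability p* = p_m - 1/N (assumed definition)."""
    return p_m - 1.0 / n

def mean_variance(T, n, p_m):
    """Mean variance of the peak after T generations, starting from zero width."""
    a = n * p_star(n, p_m) / 2.0          # equilibrium variance N p*/2
    return a * (1.0 - math.exp(-2.0 * T / n))

def equilibrium_width(n, p_m):
    """Equilibrium standard deviation sqrt(N p*/2)."""
    return math.sqrt(n * p_star(n, p_m) / 2.0)

# The variance relaxes on a timescale of N/2 generations.
for T in (0, 5000, 20000, 80000):
    print(T, mean_variance(T, 10000, 1.0))
```

The relaxation time of $N/2$ generations is why long runs are needed before the peak width can be treated as stationary.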

We now look at how the peak width $w$ varies in time, by considering the fluctuations in $F = \Delta w^2$, the change of peak size. We are interested in fluctuations around the equilibrium standard deviation $w_E^{\rm equil}$. $w_E^{\rm equil}$ is not the mean observed value of $w$; we will be able to correct it by considering higher moments. We will now assume a large population $N \gg 1$, and consider the reduced variable $s = w^2/N$ to identify leading order terms.

$$E(F^2) = 4s^2 + \frac{4 p_m s}{N} + \ldots \tag{12}$$

To represent the particular history of the evolution process we must write Eq. (11) with an additional noise term $\sqrt{E(F^2)}\,\eta(t) \approx (2w^2/N)\,\eta(t)$, where $\eta(t)$ has mean zero and standard deviation 1 (keeping up to second order moments in the noise; higher moments are $O(1/N)$ smaller). In generational time $T = t/N$, as $\Delta T \to 0$ we obtain:

$$dw_E^2(T) = \left[p^* - \frac{2w_E^2}{N}\right] dT + \frac{2w_E^2}{\sqrt{N}}\,dW. \tag{13}$$



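Here $W(T)$ is a Wiener process [20]. We solve by finding the Fokker-Planck equation [?]:

$$\frac{\partial p(w^2,T)}{\partial T} + \frac{\partial\left([p^* - 2w^2/N]\,p(w^2,T)\right)}{\partial (w^2)} - \frac{1}{2}\,\frac{\partial^2\left(4w^4\,p(w^2,T)/N\right)}{\partial (w^2)^2} = 0. \tag{14}$$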



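Seeking the steady state solution $\partial p(w^2,T)/\partial T = 0$, integration twice shows that (for this to be a probability distribution) the unique solution is:

$$p(w^2)\,d(w^2) = \frac{(Np^*)^2}{4w^6}\,e^{-Np^*/2w^2}\,d(w^2), \tag{15}$$

$$p(w)\,dw = \frac{(Np^*)^2}{2w^5}\,e^{-Np^*/2w^2}\,dw. \tag{16}$$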



The tail of $p(w)$ is a power law, corresponding to the existence of multiple (arbitrarily distant) clusters within the peak. From this we can calculate the arithmetic mean of the peak width, corrected for noise:

$$\langle w \rangle = \int_0^{\infty} w\,p(w)\,dw = \sqrt{\frac{\pi N p^*}{8}}. \tag{17}$$



This contrasts with Diffusion, as $w_D$ has no stationary distribution and follows Eq. (10). The standard deviation of the peak width is:

$$\sigma_w = \sqrt{\langle w^2 \rangle - \langle w \rangle^2} = \sqrt{Np^*(1 - \pi/4)/2}. \tag{18}$$
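
The closed forms for the mean and standard deviation of the peak width can be checked against a direct numerical integration of $w\,p(w)$. The sketch below assumes the steady-state density $p(w) = (Np^*)^2 e^{-Np^*/2w^2}/(2w^5)$ that follows from the Fokker-Planck equation, together with $p^* = p_m - 1/N$; the function names, the integration cutoff and the grid size are arbitrary choices made for illustration.

```python
import math

def width_moments(n, p_m):
    """Closed-form mean and standard deviation of the stationary peak width."""
    a = n * (p_m - 1.0 / n) / 2.0            # a = N p*/2, which also equals <w^2>
    mean_w = math.sqrt(math.pi * a) / 2.0    # sqrt(pi N p*/8)
    sigma_w = math.sqrt(a - mean_w ** 2)     # sqrt(N p* (1 - pi/4)/2)
    return mean_w, sigma_w

def mean_width_quadrature(n, p_m, cutoff=200.0, steps=200_000):
    """Midpoint-rule estimate of the integral of w p(w) from 0 to cutoff * sqrt(a)."""
    a = n * (p_m - 1.0 / n) / 2.0
    h = cutoff * math.sqrt(a) / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * h
        # w p(w) = 2 a^2 w^-4 exp(-a / w^2); exp underflows harmlessly to 0 near w = 0
        total += 2.0 * a * a * w ** -4 * math.exp(-a / (w * w))
    return total * h

mean_w, sigma_w = width_moments(10000, 1.0)
print(mean_w, sigma_w, sigma_w / mean_w)
```

For $N = 10000$ and $p_m = 1$ the closed-form mean evaluates to about 62.66, and the ratio $\sigma_w / \langle w \rangle$ does not depend on $N$.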

Therefore the standard deviation in the peak width increases at the same rate (with $N$) as the peak width itself. The 4th and higher moments of the distribution of peak widths diverge due to the power law tail of $p(w)$. The model approximations are confirmed by numerics. Both Eq. (13) and $w(t)$ for the 'Evolutionary Diffusion Process' defined initially have indistinguishable signals and Power Spectra (not shown), and conform to Eq. (17) to within 2%: for $N = 10000$ and $p_m = 1$, with 200 runs of $10^{?}$ generations, counting $w(t)$ after time $5 \times 10^{?}$, we find $\langle w \rangle = 64.34 \pm 2.14$ for the Evolutionary Diffusion Process and $\langle w \rangle = 63.17 \pm 1.20$ for Eq. (13), compared with a theoretical prediction of $\langle w \rangle = 62.66$. Eq. (13) is fast to simulate for long times and, as indicated, behaves very similarly to the microscopic process.
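
Since the stochastic equation for the variance is cheap to integrate, a minimal Euler-Maruyama sketch is given here. It assumes the Itô form $dw^2 = (p^* - 2w^2/N)\,dT + (2w^2/\sqrt{N})\,dW$ in generational time, with $p^* = p_m - 1/N$; the function name, the step size `dt`, the clamp at zero and the `noise` switch are illustrative choices and are not taken from the original numerics.

```python
import math
import random

def simulate_width_sde(n, p_m, generations, dt=0.01, seed=0, noise=True):
    """Euler-Maruyama integration of the variance v = w^2; returns the widths w."""
    rng = random.Random(seed)
    p_star = p_m - 1.0 / n                  # assumed definition of p*
    v = 0.0                                 # all individuals start on a single site
    steps = int(round(generations / dt))
    widths = []
    for _ in range(steps):
        # Wiener increment over dt: Gaussian with standard deviation sqrt(dt)
        dW = rng.gauss(0.0, math.sqrt(dt)) if noise else 0.0
        v += (p_star - 2.0 * v / n) * dt + (2.0 * v / math.sqrt(n)) * dW
        v = max(v, 0.0)                     # guard against a negative variance from a large discrete kick
        widths.append(math.sqrt(v))
    return widths
```

With `noise=False` the scheme reduces to the deterministic relaxation toward $Np^*/2$, which gives a simple sanity check of the drift term. A time average of the returned widths, taken after discarding several multiples of $N/2$ generations, can be compared with $\sqrt{\pi N p^*/8}$.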

We now examine the behavior of the expected root-mean-square (RMS) displacement of the peak center as a function of time; direct integration of $\langle (\Delta x)^2 \rangle = \langle (x_q - y_q)^2 / N^2 \rangle$, using the steady state value $\langle w^2 \rangle = Np^*/2$ in Eq. (??), yields the following step size for evolution:

$$\Delta \langle x \rangle_E^{\rm RMS}(t) \approx \sqrt{p^*/N}. \tag{19}$$

From random walk theory [20], the mean (RMS) position of a random walker taking steps of size $S$ after $t$ timesteps is $\langle x \rangle^{\rm RMS} = S\sqrt{t}$. Hence:

$$\langle x \rangle_D^{\rm RMS}(T) = \sqrt{p_m t}/N = \sqrt{p_m T/N}, \tag{20}$$

$$\langle x \rangle_E^{\rm RMS}(T) = \sqrt{p^* t/N} = \sqrt{p^* T}. \tag{21}$$

Hence, in the limit of infinite N the Diffusion process 
remains stationary, but in generational time the mean 
position of the Evolutionary Diffusion process does a ran- 
dom walk of step size independent of the total number of 
individuals. 
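
The contrast between the two scalings is easy to tabulate. The sketch below assumes the forms $\sqrt{p_m T/N}$ for Diffusion and $\sqrt{p^* T}$ for Evolutionary Diffusion, with $p^* = p_m - 1/N$; the function names are hypothetical.

```python
import math

def rms_diffusion(T, n, p_m):
    """RMS displacement of the mean position for Diffusion after T generations."""
    return math.sqrt(p_m * T / n)

def rms_evolution(T, n, p_m):
    """RMS displacement of the peak center for Evolutionary Diffusion after T generations."""
    return math.sqrt((p_m - 1.0 / n) * T)

# At fixed generational time, the Diffusion value shrinks with N
# while the Evolutionary Diffusion value approaches sqrt(p_m * T).
for n in (10**2, 10**4, 10**6):
    print(n, rms_diffusion(100, n, 1.0), rms_evolution(100, n, 1.0))
```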

For completeness we could write an equation for $\mu(T) = \langle x \rangle$ for evolution as: $d\mu_E(T) = N^{-1/2}\sqrt{p_m + 2w_E^2(T)}\,dW$. This equation together with Eq. (13) describes the system fully, and both are completely solved once the peak width reaches its equilibrium probability distribution.

We have described the microscopic behavior of the evolution of reproducing individuals in a type space, and approximated it by two coupled solvable stochastic processes for the distribution. We find two main differences between Evolutionary Diffusion and normal Diffusion. 1) The short range mutation process effectively becomes a longer ranged (by $O(\sqrt{N})$) diffusive step. By the Central Limit Theorem, the standard deviation of the mean position $\mu$ taking $N$ steps per generation of size $A$ increases as $A\sqrt{N}$. In diffusion, the steps are of size $A \sim 1/N$, but in evolution the steps are of size $A \sim 1/\sqrt{N}$, so the convergence is not fast enough to set the location of the peak center in the infinite population limit. 2) The effective diffusion is not independent and peaks can form with fluctuating width $w$ around $\langle w \rangle$, following the distribution in Eq. (16) which has a power law tail. This provides a null hypothesis to determine whether two asexual individuals belonging to different clusters of a phenotype are in fact subject to the same selection pressure, i.e. are members of a single neutrally evolving population or 'peak', or whether differential selection is responsible for the population breaking up into separate clusters. In the neutral case all but one cluster will go extinct. However, if differential selection acts then several clusters of phenotype may survive in separate 'niches'.

In terms of replicator dynamics, our results transpar- 
ently explain how a 'species' in type space (the peak de- 
scribed above) is able to maintain its coherence as it per- 
forms a random walk due to mutation prone reproduc- 
tion. We found that the distribution of a phenotype in 
neutral evolution is 'non-trivial' regardless of population 
size. In terms of diffusion, we describe an interesting type 
of particle interaction that allows for clustering. 

DL is funded by an EPSRC studentship, and is grateful 
to Nelson Bernardino for useful discussions. 